Articulatory Prosodies in German Reduced Speech
Abstract
Starting from spontaneous speech data of the Kiel Corpus, this paper describes reduction patterns of function words, which also incorporate more global articulatory prosodies such as nasality, labiality and glottalization. The results of four perceptual experiments support the hypothesis that these long components of speech production are mapped onto perception. The discussion also contributes to a new paradigm for the analysis of non-lab and non-scripted speech.

1. PRODUCTION PATTERNS OF FUNCTION WORDS

1.1. Articulatory Fusion and Lexicalization

The phonetic form of function words in German varies widely along an articulatory scale from strong elaboration to a high degree of reduction [7,8], depending, among other factors, on prosody (especially sentence accent and prosodic grouping) and on context of situation. In sequences of unstressed function words, reduction may result in their articulatory fusion, e.g., of prepositions with articles (a) or of auxiliary verbs with enclitic pronouns (b). This may lead to the emergence of new lexical items, for example, in category (a), to monosyllabic “zum” [tsum] (“to the”) alongside bisyllabic “zu dem”, which varies along the reduction scale from [tsu: de:m] to [tsu bm]. Although related historically and with regard to articulatory reduction, the mono- and bisyllabic phonetic forms today pattern differently in phrasal collocation due to the lexicalization of “zum”, as in “er kam zum Schluß” (“he came at/to the end”) vs. “er kam zu dem Schluß, daß...” (“he reached the conclusion that...”). In the parallel case “mit dem” (“with the”), on the other hand, articulatory fusion to monosyllabic [mɪm], beyond the bisyllabic reduction scale from [mɪt de:m] to [mɪ bm], does not result in a new lexicalization.
Similarly, “haben/können/sind/sollen/wollen wir” (“have/can/are/shall/will we”), in category (b), vary along the scales from [ha:bən vi:ɐ]/[kœnən vi:ɐ]/[zɪnt vi:ɐ]/[zɔlən vi:ɐ]/[vɔlən vi:ɐ] to [hamɐ]/[kœmɐ]/[zɪmɐ]/[zɔmɐ]/[vɔmɐ], resulting in a new inflectional paradigm of fused lexical items that are all bisyllabic. This reduction tendency is particularly strong in spontaneous speech.

1.2. Articulatory Prosodies of Nasality and Labiality

The Kiel Corpus [2] contains the example “nun wollen wir mal kucken” (“now let's see”) in the phonetic form [nu: X”6 ma ‘khukn] for unreduced [nu:n vɔlən vi:ɐ ma:l ˈkhukŋ] [7, 9, 10]. It has strong nasalization across its first three syllables, relating to syllable-final nasal consonants that are reduced (deleted or shortened) in this hypo pronunciation as against the hyper pronunciation. There is additional labiodentalization around the third syllable, representing canonical [v] of “wir”. Other possible realizations are [nu: 5(rrjm)5 ma ‘khukn] [10], where the apical gesture of the medial nasal is also eliminated or the consonant deleted altogether. So in these fusions of function words, articulatory residues may persist as non-linear, suprasegmental features of syllables, reflecting, e.g., nasality or labiality that is no longer tied to specific segmental units. In those cases where the vowel in the first syllable bears a close acoustic relationship to the vowel in the second syllable, i.e. [œ ɐ] and [ɔ ɐ] in [kœmɐ]/[zɔmɐ]/[vɔmɐ], the reduction can go further to a nasalised monosyllabic realization [kœ̃(:)]/[zɔ̃(:)]/[vɔ̃(:)], in, e.g., “können/sollen/wollen wir das machen” (“can/shall/will we do this”). Thus “nun wollen wir mal kucken” may also be expected to be realised as [nu: ɔ̃(:) ma ˈkhukŋ]. So nasalization may be the only articulatory parameter left to differentiate the production of “sollen wir das machen” [z% das ‘maxan] from “soll er das machen” [ZX das ‘maxan] (“is he to do it”), which lacks it.
The same nasal/oral dichotomy may apply to “sollen sie” [zɔ̃zi] (“are they to”) vs. [zɔzi] (“is she to”).

1.3. Phonatory Prosody of Glottalization

Besides the articulatory prosodies of nasality and labiality, the verbal paradigm also makes use of glottalization to differentiate “könnten/sollten/wollten wir/sie” (“could/should/would we/they”) from “können/sollen/wollen wir/sie”. Instead of stopping the air stream for [t] by velic action in a nasal context, the velum may be lowered throughout, and the signalling of a break, as for a plosive, may then be achieved by glottal activity, in the extreme case by a glottal stop, but irregular glottal vibration is equally possible at any point during the nasal segment [5]. Thus [koem~]/[z~ll?~]/[v~~~], [keen zi]/[zDn zi]/[vDn zi] (alternatively with an additional syllabic modal-voice nasal after the glottalized part) are possible realizations. Furthermore, the nasal stop articulation may again be replaced by syllable nasality overlaying at least the first vowel, with the glottalization then associated with its final section, as in [kc@ zi]/[z$ zi]/[v2 zi]. That is less likely to occur in the context of “wir” because in more hyper production this type of glottalization is connected with an oral occlusion, and therefore presupposes a consonantal gesture, as in the position before [zi]. But in [mɐ], after a vowel, the elimination of a labial closing movement leads to vowel-internal glottalization. However, the consonantal link remains and the probability of occurrence increases if, within a frame of global syllable nasality, glottalization is coupled with a lip gesture into and out of an approximant stricture, e.g. in [z5@] or [z$E]. The distinction between a consonantal and a vocalic base of glottalization plays an important role in German phonology. An example of the former is the replacement of plosives in a sonorant, especially nasal, environment, as outlined above; the latter functions as a word-initial boundary marker.
The different vocal tract resonances for the irregular glottal pulses in the two cases are illustrated in the spectrographic analysis of “wir könnten ihn fragen” (“we could ask him”) [vie koennn jn ‘fra:gn] in Figure 1. Glottalization to mark vowel onset may start in a preceding sonorant configuration, but this “overspill” is much shorter than the actual vocalic base glottalization, e.g. in “wir können ihn fragen” (“we can ask him”) [vie koenn j:n ‘fi-a:gg] of Figure 2.

ICPhS99 San Francisco

Figure 1. Spectrogram of “(wir) könnten ihn (fragen)”.

Figure 2. Spectrogram of “(wir) können ihn (fragen)”.

1.4. Reduction Rules

Reduction of function words in German exhibits patterns that can be formulated in the following rules by reference to accented strong citation form pronunciations:

(1) The degree of reduction depends on word class, morphological, syntactic and prosodic structures as well as speaking style. It is particularly high for articles and their combinations with prepositions as well as for enclitic sequences of auxiliary verbs and pronouns, in certain cases resulting in new lexicalizations.
(2) Diphthongs tend towards monophthongization, long vowels towards shortening and all vowels towards more central and mid positions. In extreme cases the result is [ə], or [ɐ] when phonological /r/ is involved: “ein” [ən], “der” [dɐ], “wir” [vɐ], “mit dem” [mɪd/təm], “zum” [tsəm], “zur” [tsɐ].
(3) The glottal word boundary marker of an initial vowel may be eliminated inside unaccented preposition + article and auxiliary + enclitic pronoun constructions: “auf einen/einem”, “soll er”.
(4) [ə], including the result of (2), may be deleted, e.g. “haben/können/sollen/wollen” [ha:bn]/[kœnn]/[zɔln]/[vɔln], “ein” [n], “mit dem” [mɪd/tm], “zum” [tsm]. Subsequent rules also apply to the segment sequences resulting from (4).
(5) Interconsonantal /t/ may be deleted: “sind wir” [zɪn vɐ].
(6) Apical stop consonants are assimilated in place to following labials/dorsals, irrespective of word boundaries; apical nasals are also adjusted to preceding labials/dorsals within the same word: “haben” [ha:bm], “mit dem” [mɪb/pm], “können/sind/sollen/wollen wir” [kœnn/mvɐ]/[zɪn/mvɐ]/[zɔln/mvɐ]/[vɔln/mvɐ].
(7) Final /l/ may be deleted, even before initial vowels of enclitic words: “mal”, “soll(en)”, “solch”, “welch”, “will”, “wollen”.
(8) Velic closure in lenis plosives before nasals may be cut out and [q/mv] integrated in a single bilabial nasal gesture [m]: “haben” [ha:m], “mit dem” [mɪm]; [hamɐ] etc.
(9) The velic closing movement for a plosive in a nasal environment is relaxed and a prosody of irregular glottal activity produced instead: “könnten” [kcennn], “sollten wir” [zDrn(m)e].
(10) A postvocalic closing movement for a nasal consonant is reduced or totally eliminated, with nasality spreading, particularly across the preceding vowel; mid to open diphthongs may be monophthongized: “sollen wir” [ z5E] [ ~551 [ zs:], “sollten wir” [ z@E].

2. PERCEPTION PATTERNS OF FUNCTION WORDS

2.1. A Hypothesis and a New Experimental Frame

Since the production patterns found in function words are an essential feature of connected, especially spontaneous, speech, it must be assumed that they also play a fundamental role in speech perception. The question is thus how listeners make use of the phonetic parameters contained in reduced speech to restore the intended words and utterances, and what relevance should be attributed to the global prosodic features of nasalization, labialization and glottalization, as well as to articulatory residues, over and above segmental information, for the correct decoding of connected speech.
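To get a feel for how much segmental material a listener has to restore, it helps to trace one citation form through the reduction rules of Section 1.4. The following is a minimal Python sketch, not the author's formalisation: it treats rules (4), (6) and (8) as an ordered cascade over a list of segment strings and applies them to “haben”. The function names, the segment representation and the restriction of rule (6) to labial plosives are assumptions made for illustration; the rules in the text are optional ("may be deleted") and context-dependent, whereas this toy version applies each one unconditionally.

```python
# Toy cascade for rules (4), (6) and (8) of Section 1.4.
# Assumption: segments are plain strings and every rule applies obligatorily;
# the rules described in the text are optional and far more context-sensitive.

LABIAL_PLOSIVES = {"b", "p"}


def delete_schwa(segs):
    """Rule (4), simplified: delete every schwa."""
    return [s for s in segs if s != "ə"]


def assimilate_nasal(segs):
    """Rule (6), second clause, labials only: [n] becomes [m] after a labial plosive."""
    out = list(segs)
    for i in range(1, len(out)):
        if out[i] == "n" and out[i - 1] in LABIAL_PLOSIVES:
            out[i] = "m"
    return out


def fuse_lenis_plosive(segs):
    """Rule (8), simplified: lenis [b] directly before [m] merges into the nasal."""
    out = []
    for i, s in enumerate(segs):
        if s == "b" and i + 1 < len(segs) and segs[i + 1] == "m":
            continue  # the plosive is absorbed into the following nasal gesture
        out.append(s)
    return out


RULES = [("(4)", delete_schwa), ("(6)", assimilate_nasal), ("(8)", fuse_lenis_plosive)]


def derive(segs):
    """Return the list of (rule label, form) steps, starting from the citation form."""
    steps = [("citation", list(segs))]
    for label, rule in RULES:
        segs = rule(segs)
        steps.append((label, list(segs)))
    return steps


for label, form in derive(["h", "a:", "b", "ə", "n"]):
    print(label, "".join(form))
# citation ha:bən
# (4) ha:bn
# (6) ha:bm
# (8) ha:m
```

The intermediate forms match the examples the rules give for “haben”. Keeping each rule as a separate function makes the ordering explicit, which reflects the statement in rule (4) that subsequent rules also apply to its output.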
To test this hypothesis we need a new type of data in our perception experiments, compared with the traditional paradigm, which uses very simple stimuli of syllable or word size, often of a nonsense word type, within a standard metalinguistic sentence frame, and systematically varies acoustic parameters in speech synthesis. The Haskins experiments on VOT and second formant transitions are classic examples. The aim of such perception tests is to gain insight into the perceptual relevance of specific parameter values for phoneme perception in word citation forms. None of these heavy constraints apply to the phenomena at issue at the utterance level. First, we need speech data of at least sentence size in a natural and meaningful context because it is only there that the reductions occur and can be tested perceptually. Secondly, the acoustic quality must be completely natural, so only high-level speech synthesis or very careful time-domain splicing are feasible tools for signal manipulation. Thirdly, the auditory test stimuli must be modelled on production data found in large connected speech data bases, and they should be convincingly reproduced and systematically varied in natural production by competent, phonetically trained native speakers for subsequent processing and parameter variation according to new experimental designs. Finally, the perceptual test will have to put articulatory prosodies rather than segment-type phonemes in focus [6].

Such a test frame was implemented in the following steps:

• The spontaneous speech example discussed in 1.2 as well as glottalization data from the Kiel Corpus were the starting point.
• The phrases (a) “soll er”, “soll(t)en wir”; (b) “soll sie”, “soll(t)en sie”; (c) “wir könn(t)en ihn”; (d) “die könn(t)en (wir) uns” were put in the utterance frames (a,b) “Was meint ihr? - das machen?” (“What do you think? - do it?”); (c) “- fragen.” (“- ask him.”); (d) “- abholen.” (“- collect them (for ourselves).”).
• The author produced each of these utterances with a broad array of phrase realizations from hyper to hypo along the scales found in connected speech data. The recordings were auditorily screened and the most convincing rendering of each utterance, phrase and reduction type selected for processing.
• The data were then analysed in xassp [3] and systematic splicing was applied to create test stimuli for 4 listening tests: (a) soller, (b) sollsie, (c) könnfra, (d) könnab.
• Each test stimulus, preceded by a 50 ms sine warning tone and followed by a 4 s pause, was copied 10 times. The stimuli were randomised and copied onto 4 separate analog test tapes.
• Questionnaires were prepared with three alternatives in (a) and (b) (“soll er”, “sollen wir”, “sollten wir”; “soll sie”, “sollen sie”, “sollten sie”) and two alternatives in (c) and (d) (“können”, “könnten”) for forced choice answers.
• All 4 tests were administered in a sound treated room via loudspeaker, first to a group of phonetics students in two sessions in the order (a), (b) and (c), (d) and then to another group of phonetically naive students in one session. The complete test with introduction and breaks lasted about 75 mins.

2.2. Prosodies of Nasalization and Glottalization: Exp. soller

2.2.1. Stimulus description and results.

All stimuli for Exps. soller and sollsie have identical frames “s - das machen”, taken from the selected production “soll sie das machen” [z~r zi das ‘maxan]. For Exp. soller the following excerpts were spliced into this frame: “(s)oll er” [ol C] (ser1), “(s)oll er” [:, e] (ser2), “(s)ollen wir” [olq e] (ser3), “(s)ollen wir” [3 e] (ser4), “(s)ollten wir” [al? mg] (ser7), all from the selected stimulus set “soll er/soll(t)en wir das machen”. Then signal manipulations were carried out.

(a) One period was removed from the centre of [S] in ser4 to give this stimulus the same duration as ser2.
(b) [5] from natural ssi4 was spliced into the frame (ser5). The ssi4 vowel was lengthened by duplicating central periods to give it the same duration as [5 E] in ser4 (ser6).
(c) Glottalization pulses, excerpted from natural “(s)ollten wir” [z? rr~e], were spliced into the vowel centre of ser4 and duplicated (ser8).
(d) In ser8 the second half of the glottalized section as well as 4 periods of the following modal voice were replaced by the signal segment [rrj] of the source stimulus in (c) (ser9).

ser1,3,7 represent less reduced and therefore clear cases of “soll er”, “sollen wir”, “sollten wir”, respectively. ser2,4 are contrasted by the absence/presence of a nasal prosody to differentiate “soll er” from “sollen wir”. ser5,6 have a short vs. long nasalized monophthong instead of the diphthong. ser8,9 introduce a prosody of glottalization into a global nasal prosody without and with labiodentalization. The results of the listening test are in Table 1.

2.2.2. Discussion.

• The anchors ser1,3,7 are uniquely identified.
• In ser2,4 a nasal prosody across the diphthong can signal the “soll er”/“sollen wir” distinction, but no longer uniquely.
• As to the nasal monophthongs in ser5,6, the two groups behave differently: in Gr1 “soll er” dominates, in Gr2 “sollen wir”, with fewer “sollen wir” for the short vowel than for the long one in both groups. The two groups rated the perceived nasality differently, presumably either as a speaker’s voice quality or as a linguistic differentiator. One of the Gr1 subjects commented after the test that he had been uncertain as to these interpretations. This coincides with the fact that Gr1 were the

[Table 1: columns “soll er”, “sollen wir”, “sollten wir”]
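The duration adjustments in manipulations (a) and (b) of Section 2.2.1 (removing one period from the centre of a vowel, and lengthening a vowel by duplicating central periods) can be sketched as operations on a list of pitch periods. This is a hedged illustration in Python, not a description of the xassp workflow: it assumes the period boundaries are already known, represents each period as a list of samples, and uses hypothetical function names. Real time-domain splicing would also need care with cut points and amplitude continuity, which the sketch ignores.

```python
# Sketch of period-based duration manipulation, assuming the vowel has already
# been segmented into pitch periods (each period = list of samples).
# All names are hypothetical; nothing here reproduces the xassp tool.


def duration(periods):
    """Total duration in samples."""
    return sum(len(p) for p in periods)


def remove_central_period(periods):
    """Manipulation (a): drop the single period at the centre of the vowel."""
    if not periods:
        raise ValueError("no periods to remove")
    mid = len(periods) // 2
    return periods[:mid] + periods[mid + 1:]


def lengthen_to(periods, target_samples):
    """Manipulation (b): duplicate central periods until the target duration is reached.

    The result can overshoot the target by up to one period, because only
    whole periods are inserted.
    """
    if not periods or any(len(p) == 0 for p in periods):
        raise ValueError("need at least one non-empty period")
    out = list(periods)
    while duration(out) < target_samples:
        mid = len(out) // 2
        out.insert(mid, out[mid])  # repeat the current central period
    return out


# Toy vowel: 6 periods of 100 samples each (silence, for illustration only).
vowel = [[0.0] * 100 for _ in range(6)]
shorter = remove_central_period(vowel)
longer = lengthen_to(vowel, 850)
print(duration(vowel), duration(shorter), duration(longer))
# 600 500 900
```

Working in whole periods keeps the fundamental frequency contour locally intact, which is presumably why the manipulations in the text are stated in periods rather than in milliseconds; the price is that two stimuli can only be matched in duration to within one period.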